Discriminative Training for Near-Synonym Substitution
Abstract
Near-synonyms are useful knowledge resources for many natural language applications, such as query expansion for information retrieval (IR) and paraphrasing for text generation. However, near-synonyms are not necessarily interchangeable in context because of their specific usage and syntactic constraints. Accordingly, it is worthwhile to develop algorithms that verify whether a near-synonym matches a given context. In this paper, we treat near-synonym substitution as a classification task, in which a classifier is trained for each near-synonym set to assign each test example to one of the near-synonyms in the set. We also propose the use of discriminative training to improve the classifiers by distinguishing positive and negative features for each near-synonym. Experimental results show that the proposed method achieves higher accuracy than both the pointwise mutual information (PMI) and n-gram-based methods used in previous studies.
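To make the PMI baseline mentioned in the abstract concrete, here is a minimal sketch of how a near-synonym might be chosen for a context by summing PMI scores. It is not the paper's method or its discriminative training procedure. The toy corpus, the sentence-level co-occurrence window, the additive smoothing constant, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def train_counts(sentences):
    """Count word and sentence-level co-occurrence frequencies (assumed window: one sentence)."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        word_counts.update(tokens)
        # Store each unordered pair once, with the words in sorted order.
        for a, b in combinations(sorted(tokens), 2):
            pair_counts[(a, b)] += 1
    return word_counts, pair_counts, len(sentences)


def pmi(w1, w2, word_counts, pair_counts, n, smoothing=0.5):
    """Smoothed PMI: log2( P(w1, w2) / (P(w1) * P(w2)) ). The smoothing value is an assumption."""
    key = tuple(sorted((w1, w2)))
    p_joint = (pair_counts[key] + smoothing) / (n + smoothing)
    p1 = (word_counts[w1] + smoothing) / (n + smoothing)
    p2 = (word_counts[w2] + smoothing) / (n + smoothing)
    return math.log2(p_joint / (p1 * p2))


def choose_near_synonym(context, candidates, word_counts, pair_counts, n):
    """Return the candidate with the highest summed PMI against the context words."""
    scores = {
        c: sum(pmi(c, w, word_counts, pair_counts, n) for w in context)
        for c in candidates
    }
    return max(scores, key=scores.get), scores


# Toy corpus (hypothetical data, for illustration only).
corpus = [
    "the strong coffee kept me awake",
    "she drinks strong tea every morning",
    "a powerful engine drives the car",
    "the powerful computer runs fast",
    "strong coffee and strong tea",
    "a powerful car engine",
]
word_counts, pair_counts, n = train_counts(corpus)

best_for_coffee, _ = choose_near_synonym(
    ["coffee"], ["strong", "powerful"], word_counts, pair_counts, n
)
best_for_engine, _ = choose_near_synonym(
    ["engine"], ["strong", "powerful"], word_counts, pair_counts, n
)
print(best_for_coffee, best_for_engine)
```

A classifier-based approach such as the one the abstract describes would replace the fixed PMI sum with weights learned per near-synonym set; that step is not shown here.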
Similar resources
Discriminative modeling of context-specific amino acid substitution probabilities
2 The discriminative model space contains the generative model space. In the following we will show that the generative model with any set of parameters is equivalent to the discriminative model with an appropriately chosen set of parameters. In other words, the discriminative model with these particular parameters predicts the same context-specific substitution probabilities P(a|C_i) as the gen...
Unsupervised Phrasal Near-Synonym Generation from Text Corpora
Unsupervised discovery of synonymous phrases is useful in a variety of tasks ranging from text mining and search engines to semantic analysis and machine translation. This paper presents an unsupervised corpus-based conditional model: Near-Synonym System (NeSS) for finding phrasal synonyms and near synonyms that requires only a large monolingual corpus. The method is based on maximizing informa...
Using WordNet synonym substitution to enhance UMLS source integration
OBJECTIVE: Synonym-substitution algorithms have been developed for the purpose of matching source vocabulary terms with existing Unified Medical Language System (UMLS) terms during the integration process. A drawback is the possible explosion in the number of newly generated (potential) synonyms, which can tax computational and expert review resources. Experiments are run using a synonym-substit...
A Discriminative Learning Model for Coordinate Conjunctions
We propose a sequence-alignment based method for detecting and disambiguating coordinate conjunctions. In this method, averaged perceptron learning is used to adapt the substitution matrix to the training data drawn from the target language and domain. To reduce the cost of training data construction, our method accepts training examples in which complete word-by-word alignment labels are missi...
Practical Linguistic Steganography using Contextual Synonym Substitution and a Novel Vertex Coding Method
Linguistic steganography is concerned with hiding information in natural language text. One of the major transformations used in linguistic steganography is synonym substitution. However, few existing studies have studied the practical application of this approach. In this article we propose two improvements to the use of synonym substitution for encoding hidden bits of information. First, we u...